IT: fix flaky HeartbeatTests.HeartbeatFailed test #424

Merged: wprzytula merged 1 commit into master from dk/fix-heartbeat-failed-flaky-test on Feb 25, 2026

@dkropachev (Contributor) commented Feb 25, 2026

Summary

  • Fix flaky HeartbeatTests.Integration_Cassandra_HeartbeatFailed test by using 1 connection per host (with_core_connections_per_host(1)) instead of the default per-shard pool
  • The default pool size causes asynchronous shard-aware connection establishment to race with the baseline total_connections count read, making the count non-deterministic and the assertion flaky
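The race described above can be illustrated with a small arithmetic model. This is only a sketch under stated assumptions, not the driver's pool or the test's actual code: the struct, the function names, the 3-node cluster, the 4 shards per node, and the idea that the check expects the connection count to drop below the baseline are all hypothetical. It also assumes that at least one connection per host is open by the time the baseline is read.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the race; none of these names come from the driver.
struct PoolSnapshot {
    std::size_t hosts;             // nodes in the test cluster (assumed)
    std::size_t target_per_host;   // connections the pool eventually opens per host
    std::size_t open_at_baseline;  // connections per host already open at the baseline read
};

// Baseline as the test would observe it: only connections open so far count.
std::size_t baseline_count(const PoolSnapshot& s) {
    return s.hosts * s.open_at_baseline;
}

// Count once one host has lost its connections and the others are fully populated.
std::size_t count_after_host_lost(const PoolSnapshot& s) {
    return (s.hosts - 1) * s.target_per_host;
}

// Assumed shape of the check: the count must drop below the baseline.
bool drop_is_observed(const PoolSnapshot& s) {
    return count_after_host_lost(s) < baseline_count(s);
}

// Walks through three timing cases.
void check_examples() {
    // Per-shard pool, baseline read early: baseline 3, later count 8, check fails.
    assert(!drop_is_observed({3, 4, 1}));
    // Same pool, baseline read after the pool filled: baseline 12, later count 8, check passes.
    assert(drop_is_observed({3, 4, 4}));
    // One connection per host: baseline 3, later count 2, check passes.
    assert(drop_is_observed({3, 1, 1}));
}
```

In this model the per-shard pool passes or fails depending on when the baseline is read, whereas with one connection per host the target and the already-open count coincide, so the outcome no longer depends on timing.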

Fixes #395

Test plan

  • Run HeartbeatTests.Integration_Cassandra_HeartbeatFailed repeatedly to verify it no longer flakes

@dkropachev force-pushed the dk/fix-heartbeat-failed-flaky-test branch from d6c3616 to e88160b on February 25, 2026 02:52
@dkropachev marked this pull request as ready for review on February 25, 2026 02:58

@wprzytula (Contributor) left a comment

Explanations from the cover letter and the commit message are different, although not mutually exclusive. I'm confused.
Cover letter mentions completing TCP handshakes to SIGSTOP'ed nodes, while the commit message mentions additional connections opened asynchronously to shards.

@dkropachev force-pushed the dk/fix-heartbeat-failed-flaky-test branch from e88160b to 4fd4dc2 on February 25, 2026 16:13
Use 1 connection per host instead of the default per-shard pool to
avoid asynchronous shard-aware connection establishment racing with
the baseline connection count read.

Fixes #395
@dkropachev force-pushed the dk/fix-heartbeat-failed-flaky-test branch from 4fd4dc2 to 10bcf2e on February 25, 2026 16:16

@wprzytula (Contributor) commented

> Explanations from the cover letter and the commit message are different, although not mutually exclusive. I'm confused. Cover letter mentions completing TCP handshakes to SIGSTOP'ed nodes, while the commit message mentions additional connections opened asynchronously to shards.

@dkropachev can you explain?

@dkropachev (Contributor, Author) commented Feb 25, 2026

> Explanations from the cover letter and the commit message are different, although not mutually exclusive. I'm confused. Cover letter mentions completing TCP handshakes to SIGSTOP'ed nodes, while the commit message mentions additional connections opened asynchronously to shards.
>
> @dkropachev can you explain?

I forgot to update it; I have changed the code twice since the PR description was created.
Please check now.

@wprzytula merged commit b71bb07 into master on Feb 25, 2026
9 checks passed
@wprzytula deleted the dk/fix-heartbeat-failed-flaky-test branch on February 25, 2026 20:58

Development

Successfully merging this pull request may close these issues.

HeartbeatTests.Integration_Cassandra_HeartbeatFailed is flaky

3 participants